{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Bayesian Statistics Made Simple\n",
    "===\n",
    "\n",
    "Code and exercises from my workshop on Bayesian statistics in Python.\n",
    "\n",
    "Copyright 2016 Allen Downey\n",
    "\n",
    "MIT License: https://opensource.org/licenses/MIT"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# If we're running on Colab, install empiricaldist\n",
    "# https://pypi.org/project/empiricaldist/\n",
    "\n",
    "import sys\n",
    "IN_COLAB = 'google.colab' in sys.modules\n",
    "\n",
    "if IN_COLAB:\n",
    "    !pip install empiricaldist"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "import seaborn as sns\n",
    "sns.set_style('white')\n",
    "sns.set_context('talk')\n",
    "\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "from empiricaldist import Pmf"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Working with Pmfs\n",
    "\n",
    "Create a Pmf object to represent a six-sided die."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "d6 = Pmf()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A Pmf is a map from possible outcomes to their probabilities."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "for x in [1,2,3,4,5,6]:\n",
    "    d6[x] = 1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Initially the probabilities don't add up to 1."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "d6"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`normalize` adds up the probabilities and divides through.  The return value is the total probability before normalizing."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "d6.normalize()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now the Pmf is normalized."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "d6"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And we can compute its mean (which only works if it's normalized)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "d6.mean()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`choice` chooses a random values from the Pmf."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "d6.choice(size=10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`bar` plots the Pmf as a bar chart"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "def decorate_dice(title):\n",
    "    \"\"\"Labels the axes.\n",
    "    \n",
    "    title: string\n",
    "    \"\"\"\n",
    "    plt.xlabel('Outcome')\n",
    "    plt.ylabel('PMF')\n",
    "    plt.title(title)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "d6.bar()\n",
    "decorate_dice('One die')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`d6.add_dist(d6)` creates a new `Pmf` that represents the sum of two six-sided dice."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "twice = d6.add_dist(d6)\n",
    "twice"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Exercise 1:**  Plot `twice` and compute its mean."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solution goes here"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Exercise 2:** Suppose I roll two dice and tell you the result is greater than 3.\n",
    "\n",
    "Plot the `Pmf` of the remaining possible outcomes and compute its mean."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solution goes here"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Bonus exercise:** In Dungeons and Dragons, the amount of damage a [goblin](https://www.dndbeyond.com/monsters/goblin) can withstand is the sum of two six-sided dice.  The amount of damage you inflict with a [short sword](https://www.dndbeyond.com/equipment/shortsword) is determined by rolling one six-sided die.\n",
    "\n",
    "Suppose you are fighting a goblin and you have already inflicted 3 points of damage.  What is your probability of defeating the goblin with your next successful attack?\n",
    "\n",
    "Hint: `Pmf` provides comparator functions like `gt_dist` and `le_dist`, which compare two distributions and return a probability."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solution goes here"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solution goes here"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### The cookie problem\n",
    "\n",
    "`Pmf.from_seq` makes a `Pmf` object from a sequence of values.\n",
    "\n",
    "Here's how we can use it to create a `Pmf` with two equally likely hypotheses."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [],
   "source": [
    "cookie = Pmf.from_seq(['Bowl 1', 'Bowl 2'])\n",
    "cookie"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we can update each hypothesis with the likelihood of the data (a vanilla cookie)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [],
   "source": [
    "cookie['Bowl 1'] *= 0.75\n",
    "cookie['Bowl 2'] *= 0.5\n",
    "cookie.normalize()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And display the posterior probabilities."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [],
   "source": [
    "cookie"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Exercise 3:** Suppose we put the first cookie back, stir, choose again from the same bowl, and get a chocolate cookie.  \n",
    "\n",
    "What are the posterior probabilities after the second cookie?\n",
    "\n",
    "Hint: The posterior (after the first cookie) becomes the prior (before the second cookie)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solution goes here"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Exercise 4:** Instead of doing two updates, what if we collapse the two pieces of data into one update?\n",
    "\n",
    "Re-initialize `Pmf` with two equally likely hypotheses and perform one update based on two pieces of data, a vanilla cookie and a chocolate cookie.\n",
    "\n",
    "The result should be the same regardless of how many updates you do (or the order of updates)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solution goes here"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
